Abstract



We investigate the performance of the Neyman-Pearson detection of a stationary Gaussian process
in noise, using a large wireless sensor network (WSN). In our model, each sensor compresses its
observation sequence using a linear precoder. The final decision is taken by a fusion center (FC) based on
the compressed information. Two families of precoders are studied: random iid precoders and orthogonal
precoders. We analyse their performance in the regime where both the number of sensors $k$ and the
number of samples $n$ per sensor tend to infinity at the same rate, that is, $k/n \to c \in (0, 1)$. Contributions
are as follows. 1) Using results of random matrix theory and on large Toeplitz matrices, it is proved
that the miss probability of the Neyman-Pearson detector converges exponentially to zero when the
above families of precoders are used. Closed form expressions of the corresponding error exponents are
provided. 2) In particular, we propose a practical orthogonal precoding strategy, the Principal Frequencies
Strategy (PFS), which achieves the best error exponent among all orthogonal strategies, and which
requires very little signaling overhead between the central processor and the nodes of the network. 3)
Moreover, when the PFS is used, a simplified low-complexity testing procedure can be implemented
at the FC. We show that the proposed suboptimal test enjoys the same error exponent as the Neyman-Pearson
test, which indicates a similar asymptotic behaviour of the performance. We illustrate our
findings by numerical experiments on some examples.



I. INTRODUCTION 

The design of powerful tests for detecting the presence of a stochastic signal using
large WSNs is a crucial issue in a wide range of applications. We investigate the Neyman-Pearson
detection of a Gaussian signal using a wireless network of $k$ sensors. Each sensor

The authors are with Institut Telecom / Telecom ParisTech / CNRS LTCI, France,
e-mails: {bianchi, jakubowi, roueff}@telecom-paristech.fr






SUBMITTED TO IEEE TRANSACTIONS ON SIGNAL PROCESSING 



observes a finite sample of the signal of interest, corrupted by additive noise, and then forwards
some information towards the FC which takes the final decision. Neyman-Pearson detection of
Gaussian signals using large sensor networks has been thoroughly investigated in the literature
(see for instance [1], [2] and references therein). In such works, the FC is assumed to have
perfect knowledge of the observation sequence of each sensor. Unfortunately, in a WSN, the
amount of information forwarded by each sensor node to the FC is usually limited, due to channel
capacity constraints. Thus, in practice, each sensor node must compress its information in some
way before transmission to the FC. This compression step of course degrades the performance
of the detection. A large number of works have been devoted to the determination of relevant
compression strategies, essentially within the framework of distributed detection [3], [4]. In these
works, the data is locally processed by each sensor: typically, a local Neyman-Pearson test is
made by each node, based on the knowledge of the probabilistic law of the source to be detected.
Unfortunately, such approaches require both that each sensor possesses significant
computational ability, allowing involved processing of its data, and that each sensor has full
knowledge of the source statistics. By contrast, this paper investigates the case of dumb
WSNs. By this term, we refer to the case where:

• Individual sensor nodes are not aware of their mission and their environment. They process
the observed data with few or no instructions from the central processor.

• The processing abilities of each sensor node are limited due either to hardware or energy
constraints.

Dumb WSNs are of practical interest because they are simple, flexible (i.e., easily reconfigurable
as a function of the sensor network's mission) and avoid an excess of signaling overhead in the
network.

The aim of this paper is to propose and to study different compressing strategies which satisfy 
the above constraints and which are attractive in terms of detection performance. The paper is 
organized as follows. Section II introduces the signal model. Each sensor is assumed to observe
$n$ noisy samples of a stationary (correlated) Gaussian source. The spectral density $f$ of the source
is known at the FC but is unknown at the sensor nodes. The aim is to detect the presence of the
source. To that end, each node forwards a compressed version of its observed sequence to the
FC. In our model, this compression is achieved through simple (linear) processing of the
data, which allows for low cost implementation. We refer to this step as linear precoding.
Section III introduces the problem of detecting the presence of the source (hypothesis $H_1$)
versus the hypothesis that only thermal noise is observed (hypothesis $H_0$). It is well known that
a uniformly most powerful (UMP) test is obtained by the celebrated Neyman-Pearson procedure.
The corresponding test is derived in Subsection III-A. Intuitively, the good detection performance
of the Neyman-Pearson test fundamentally relies on the relevant selection of the linear precoders
used at the sensor nodes. Useful families of linear precoders are introduced, namely random
iid precoders and orthogonal precoders. The detection performance associated with each of
these families is studied in the asymptotic regime where both the number $k$ of sensors and the
number $n$ of observations per sensor tend to infinity at the same rate ($k, n \to \infty$, $k/n \to c$
where $c \in (0, 1)$). More precisely, we show in Section IV that for any fixed $\alpha \in (0, 1)$, the
miss probability of the NP test of level $\alpha$ converges exponentially to zero. Error exponents are
characterized and compared for the precoding strategies of interest. In particular, it is proved
that the so-called Principal Frequencies Strategy (PFS) achieves the best error exponent among
all orthogonal strategies. Numerical computations of all the obtained error exponents on some
examples conclude this section. In the case where PFS is used, a suboptimal (non UMP) test
is proposed in Section V. Based on the proof of a Large Deviation Principle governing the
proposed test statistic, it is shown that our suboptimal test achieves the same error exponent as
the Neyman-Pearson test. Finally, Section VI is devoted to the simulations.



Notations 



Column vectors are represented by bold symbols. Notation $\|\mathbf{y}\|$ denotes the Euclidean norm
of vector $\mathbf{y}$. We denote by Leb the Lebesgue measure restricted to $[-\pi, \pi]$. For any function
$f : [-\pi, \pi] \to \mathbb{R}$, we use notation $f^{-1}(A) = \{\omega \in [-\pi, \pi] : f(\omega) \in A\}$ for the inverse image
of $A$, and we denote by $\mathrm{Leb} \circ f^{-1}$ the image measure of Leb by $f$, i.e. the measure which composes Leb
with $f^{-1}$. For any square matrix $M$, $\rho(M)$ denotes its spectral radius. Finally, $I_k$ denotes the
$k \times k$ identity matrix.

II. The Framework

A. Observation model at the sensor nodes 

Consider a set of $k$ sensors whose aim is to detect the presence of a certain source signal
$x(0), x(1), x(2), \dots$. Each sensor $i = 1, \dots, k$ collects $n$ noisy samples of the source signal. We
assume that $n > k$. Denote by $\mathbf{y}_i = [y_i(0), \dots, y_i(n-1)]^T$ the $n \times 1$ data vector observed by
sensor $i$. For each $i = 1, \dots, k$, we consider the following signal model:

$$ \mathbf{y}_i = \mathbf{x} + \mathbf{w}_i , \qquad (1) $$

where $\mathbf{x} = [x(0), \dots, x(n-1)]^T$ contains the time samples extracted from a zero mean stationary
Gaussian process $x$ with known spectral density function $f(\omega)$, $\omega \in [-\pi, \pi)$. Vector $\mathbf{w}_i =
[w_i(0), \dots, w_i(n-1)]^T$ is a zero mean white Gaussian process which stands for the thermal
noise of sensor $i$. We denote by $\sigma^2$ the variance of $w_i(0)$, which is assumed to be the same for
all $i$. Random vectors $\mathbf{x}, \mathbf{w}_1, \dots, \mathbf{w}_k$ are supposed to be independent. In the usual framework
of Gaussian source detection, the aim is to detect whether the signal $x$ of interest is present.
Formally, this reduces to the following hypothesis testing problem:

$$ H_1 : \mathbf{y}_i = \mathbf{x} + \mathbf{w}_i, \quad \forall i = 1, \dots, k $$
$$ H_0 : \mathbf{y}_i = \mathbf{w}_i, \quad \forall i = 1, \dots, k . $$

In this paper, we make the following technical assumptions on the spectral density $f$:

A1. The spectral density $f$ is continuous on $[-\pi, \pi]$.

A2. Measure $\mathrm{Leb} \circ f^{-1}$ does not put mass on points.

Assumption A2 says that $f$ cannot be constant over a set of positive Lebesgue measure (say,
an interval of positive length). This rules out, for example, a white noise for $x$. On the other hand, any
ARMA process $x$ that is not a white noise satisfies Assumptions A1 and A2.
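
As an illustration of the observation model (1), the following minimal sketch simulates the sensor data under either hypothesis. It is not part of the original paper: the helper names (`autocovariance`, `toeplitz_cov`, `simulate`), the AR(1) example density `f_ar1` and the convention $r(m) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\omega)\cos(m\omega)\,d\omega$ for the autocovariance are assumptions made for the example.

```python
import numpy as np

def autocovariance(f, n, grid=4096):
    """Approximate r(m) = (1/2pi) * int f(w) cos(m w) dw for m = 0..n-1 (Riemann sum)."""
    w = -np.pi + 2 * np.pi * np.arange(grid) / grid
    m = np.arange(n)[:, None]
    return (f(w)[None, :] * np.cos(m * w[None, :])).mean(axis=1)

def toeplitz_cov(f, n):
    """n x n Toeplitz covariance matrix whose (k, l) entry is r(|k - l|)."""
    r = autocovariance(f, n)
    idx = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return r[idx]

def simulate(f, n, k, sigma2, signal_present, rng):
    """Return an n x k array whose column i is the observation vector y_i."""
    noise = np.sqrt(sigma2) * rng.standard_normal((n, k))
    if not signal_present:          # hypothesis H0: noise only
        return noise
    gamma = toeplitz_cov(f, n)
    # Colour white noise with a Cholesky factor (tiny jitter for numerical safety).
    chol = np.linalg.cholesky(gamma + 1e-10 * np.eye(n))
    x = chol @ rng.standard_normal(n)
    return x[:, None] + noise       # hypothesis H1: the same x is seen by every sensor

def f_ar1(w, a=0.6):
    """Hypothetical example: unit-variance AR(1) spectral density (satisfies A1 and A2)."""
    return (1 - a**2) / np.abs(1 - a * np.exp(1j * w)) ** 2

rng = np.random.default_rng(0)
Y1 = simulate(f_ar1, n=32, k=8, sigma2=1.0, signal_present=True, rng=rng)
Y0 = simulate(f_ar1, n=32, k=8, sigma2=1.0, signal_present=False, rng=rng)
```

Note that the source vector is drawn once and shared by all sensors, whereas the noise columns are independent, as required by the model.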

B. Assumptions and constraints on the network 

We assume that the decision is taken by a distant node (the fusion center). The latter is
supposed to have perfect knowledge of the noise variance $\sigma^2$ and of the spectral density $f$
of the signal $x$ to be detected. In this paper, we are interested in WSNs satisfying the following
constraints.

Fig. 1. Sensor network using linear precoding at the nodes.



1) Communication constraint: In an ideal WSN architecture, each sensor $i = 1, \dots, k$ would
transmit all available observations $y_i(0), \dots, y_i(n-1)$ to the FC. Unfortunately, perfect forwarding
of the whole information sequence by each sensor $i$ is impractical in a large number of
situations, the amount of information transmitted by each sensor node to the fusion center being
usually limited. In this paper, we consider the case where only a compressed version of $\mathbf{y}_i$ can
be forwarded. More precisely, we assume that each sensor $i$ forwards a single scalar
$z_i$ to the fusion center, where $z_i$ is a certain mapping of the sequence $\mathbf{y}_i$ received by sensor $i$.

2) Signaling overhead constraint: Depending on the particular mission of the network or on
the particular spectral density $f$ to be detected, the network should be easily reconfigurable using
a limited number of feedback bits from the fusion center to the sensors. In the sequel we assume
that the spectral density $f$ is known at the fusion center but is unknown (or at most partially
known) at the sensor nodes.

3) Complexity constraint: Only low complexity data processing can be implemented
on the sensor side. More precisely, we assume that each sensor node $i = 1, \dots, k$ forwards a
linear combination

$$ z_i = \mathbf{a}_i^T \mathbf{y}_i \qquad (2) $$

of its observation sequence $\mathbf{y}_i$ to the fusion center, where $\mathbf{a}_i$ is an $n \times 1$ vector to be determined.
Figure 1 provides an illustration of the sensing scheme. Such a set of vectors $\mathbf{a}_1, \dots, \mathbf{a}_k$ will
be referred to as a linear precoder. The $n \times k$ matrix $A_n = [\mathbf{a}_1, \dots, \mathbf{a}_k]$ will be referred to as the
precoding matrix.
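
The precoding step (2) amounts to one inner product per sensor. The short sketch below shows one way to compute the whole vector of forwarded scalars at once; the function name `precode` and the convention that column $i$ of both arrays belongs to sensor $i$ are assumptions of this example.

```python
import numpy as np

def precode(A, Y):
    """Return z with z[i] = a_i^T y_i, where A and Y are n x k arrays
    whose i-th columns are the precoder a_i and the observation y_i."""
    return np.einsum("ij,ij->j", A, Y)

rng = np.random.default_rng(0)
A = rng.standard_normal((16, 4))   # hypothetical precoding matrix
Y = rng.standard_normal((16, 4))   # hypothetical observations
z = precode(A, Y)                  # one scalar per sensor
```

Each sensor only needs its own column of the precoding matrix, which is what keeps the per-node cost at $n$ multiplications.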

III. Likelihood Ratio Test 

A. Expression of the Likelihood Ratio 

We denote by $P_0$ and $P_1$ the probabilities under $H_0$ and $H_1$ and by $E_0$ and $E_1$ the corresponding
expectations. Denote by $\mathbf{z} = [z_1, \dots, z_k]^T$ the available $k \times 1$ observation vector at the fusion
center, where for each $i$, $z_i$ is defined by (2). We denote by $p_0 : \mathbb{R}^k \to \mathbb{R}_+$ and $p_1 : \mathbb{R}^k \to \mathbb{R}_+$
the joint probability density functions of $z_1, \dots, z_k$ under hypotheses $H_0$ and $H_1$ respectively.
By the celebrated Neyman-Pearson Lemma, the Likelihood Ratio Test (LRT) is uniformly
most powerful. The LRT rejects the null hypothesis for large values of the log-likelihood ratio
(LLR) defined by:

$$ \mathcal{L}_{A_n} = \log \frac{p_1(\mathbf{z})}{p_0(\mathbf{z})} . \qquad (3) $$

In the above definition, the subscript $A_n$ has been introduced to recall that the distribution of the
random variable $\mathcal{L}_{A_n}$ depends on the particular choice of the precoding matrix $A_n = [\mathbf{a}_1, \dots, \mathbf{a}_k]$.
We now derive a closed form expression of the LLR. It is worth noting that multiplying
each $\mathbf{a}_i$ by a non-zero constant does not modify the performance of the likelihood ratio test.
Hence we may normalize $A_n$ so that $\|\mathbf{a}_i\| = 1$ for each $i$ in the following. In this case, $\mathbf{z}$ is a
zero mean Gaussian random vector with covariance matrix

$$ E_1(\mathbf{z}\mathbf{z}^T) = A_n^T \Gamma_n A_n + \sigma^2 I_k $$

under hypothesis $H_1$, where $\Gamma_n = E_1(\mathbf{x}\mathbf{x}^T)$ represents the $n \times n$ covariance matrix of vector $\mathbf{x}$.
Matrix $\Gamma_n$ is the $n \times n$ Toeplitz matrix associated with the spectral density $f$ of process $x$, namely

$$ \Gamma_n = T_n(f) = \left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\omega)\, e^{i(k-l)\omega}\, d\omega \right]_{1 \le k, l \le n} . \qquad (4) $$

Under $H_0$, the covariance matrix of vector $\mathbf{z}$ simply coincides with $E_0(\mathbf{z}\mathbf{z}^T) = \sigma^2 I_k$. Using
these remarks, it is straightforward to show that

$$ 2\mathcal{L}_{A_n} = k \log \sigma^2 + \frac{\|\mathbf{z}\|^2}{\sigma^2} - \log\det(A_n^T \Gamma_n A_n + \sigma^2 I_k) - \mathbf{z}^T (A_n^T \Gamma_n A_n + \sigma^2 I_k)^{-1} \mathbf{z} . \qquad (5) $$

In the sequel, we assume as usual that the threshold of the test, say $\gamma_n$, is fixed in such a way
that the probability of false alarm (PFA) does not exceed a level $\alpha$ ($0 < \alpha < 1$), which reads

$$ P_0(\mathcal{L}_{A_n} > \gamma_n) \le \alpha . \qquad (6) $$

We now analyze the miss probability of the above LLR test as a function of $A_n$.
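
The closed form (5) can be checked numerically against the difference of the two Gaussian log-densities. The sketch below is only an illustration under the normalization $\|\mathbf{a}_i\| = 1$ used above; the function names `llr` and `gaussian_logpdf` and the randomly generated test matrices are assumptions of this example.

```python
import numpy as np

def llr(z, A, gamma, sigma2):
    """Log-likelihood ratio log p1(z)/p0(z), written as in the closed form (5)."""
    k = z.shape[0]
    s1 = A.T @ gamma @ A + sigma2 * np.eye(k)      # covariance of z under H1
    _, logdet = np.linalg.slogdet(s1)
    quad = z @ np.linalg.solve(s1, z)
    return 0.5 * (k * np.log(sigma2) + z @ z / sigma2 - logdet - quad)

def gaussian_logpdf(z, cov):
    """Log-density of a zero mean Gaussian vector with covariance cov."""
    k = z.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (k * np.log(2 * np.pi) + logdet + z @ np.linalg.solve(cov, z))

rng = np.random.default_rng(1)
n, k, sigma2 = 12, 4, 0.7
B = rng.standard_normal((n, n))
gamma = B @ B.T                                    # hypothetical signal covariance
A = rng.standard_normal((n, k))
A /= np.linalg.norm(A, axis=0)                     # unit-norm precoders
z = rng.standard_normal(k)
s1 = A.T @ gamma @ A + sigma2 * np.eye(k)
direct = gaussian_logpdf(z, s1) - gaussian_logpdf(z, sigma2 * np.eye(k))
closed_form = llr(z, A, gamma, sigma2)
```

Using `slogdet` and `solve` rather than an explicit determinant and inverse avoids overflow and is numerically safer when $k$ is large.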




B. Introduction to error exponents 

Let $P_M^n(\alpha; A_n)$ denote the miss probability of the LLR test with level $\alpha$ based on the
observation $z_1, \dots, z_k$:

$$ P_M^n(\alpha; A_n) = \inf P_1(\mathcal{L}_{A_n} < \gamma_n) , $$

where the inf is taken over all threshold values $\gamma_n$ satisfying the PFA constraint (6). The miss
probability is generally the key metric to characterize the performance of hypothesis tests.
Unfortunately, an exact expression of the miss probability as a function of $A_n$ is difficult to
obtain in the general case. Following [5], we thus analyze the asymptotic behaviour of the miss
probability as the number of available observations tends to infinity. More precisely, we study
the asymptotic regime where both the number of sensors $k$ and the number of observations $n$
per sensor tend to infinity at the same rate:

$$ n \to \infty, \quad k \to \infty, \quad \frac{k}{n} \to c \qquad (7) $$

where $c \in (0, 1)$. Any sequence of $n \times k$ precoding matrices $\mathcal{A} = (A_n)_{n \ge 0}$ will be referred to as
a linear strategy. Loosely speaking, we will prove that, at least for certain linear strategies of
interest, the miss probability behaves as

$$ P_M^n(\alpha; A_n) \simeq e^{-n K_\alpha(\mathcal{A})} $$

in the asymptotic regime (7), where $K_\alpha(\mathcal{A})$ is a certain constant which depends on the linear
strategy but, as a matter of fact, does not depend on the level $\alpha$. Such a constant is called the
error exponent. It is a key indicator of the way the power of the test is influenced by the chosen
linear strategy. More formally, we define for each $\mathcal{A}$,

$$ \underline{K}_\alpha(\mathcal{A}) = \liminf_{k \to \infty} -\frac{1}{n} \log P_M^n(\alpha; A_n) , \qquad (8) $$
$$ \overline{K}_\alpha(\mathcal{A}) = \limsup_{k \to \infty} -\frac{1}{n} \log P_M^n(\alpha; A_n) , \qquad (9) $$

and we define the error exponent of $\mathcal{A}$ as $K_\alpha(\mathcal{A}) = \underline{K}_\alpha(\mathcal{A}) = \overline{K}_\alpha(\mathcal{A})$ as soon as (8) and (9)
coincide. In the sequel, our aim is therefore to determine linear precoding strategies $\mathcal{A}$ having
a large error exponent $K_\alpha(\mathcal{A})$ (and for which $K_\alpha(\mathcal{A})$ is well-defined, of course). The following
Lemma (see [5]) provides a practical way to evaluate error exponents.

Lemma 1 ([5]) The following inequalities hold:

$$ \underline{K}_\alpha(\mathcal{A}) \ge \sup\left\{ t : \liminf_{n \to \infty} P_0\left[ \frac{1}{n} \log \frac{p_0(\mathbf{z})}{p_1(\mathbf{z})} < t \right] < \alpha \right\} , $$
$$ \overline{K}_\alpha(\mathcal{A}) \le \sup\left\{ t : \limsup_{n \to \infty} P_0\left[ \frac{1}{n} \log \frac{p_0(\mathbf{z})}{p_1(\mathbf{z})} < t \right] \le \alpha \right\} . $$

In particular if, under hypothesis $H_0$, $-n^{-1} \mathcal{L}_{A_n}$ converges in probability to a deterministic
constant $\xi$, then $K_\alpha(\mathcal{A}) = \underline{K}_\alpha(\mathcal{A}) = \overline{K}_\alpha(\mathcal{A}) = \xi$ is necessarily equal to this limit.

According to the above lemma, the asymptotic performance analysis of the LLR test reduces
to the characterization of the limit in probability of the normalized LLR as $n \to \infty$, as soon as
this limit exists. Moreover, in this case, the error exponent $K_\alpha$ is independent of the level $\alpha$.



C. Some families of precoders 

A natural approach to designing relevant precoders would be to characterize the linear strategies
$\mathcal{A}$ which maximize the limit in probability (if it exists) of the normalized LLR $-n^{-1}\mathcal{L}_{A_n}$ under $H_0$
as $n, k \to \infty$. Ideally, this
would lead to the strategies with maximal error exponent. Unfortunately, such a characterization
is difficult and would moreover lead to linear strategies which depend strongly on the
spectral density $f$ of the signal to be detected. The practical implementation of such optimal
linear strategies would typically require communicating the whole function $f$ to each sensor via
a feedback link from the central processor. In this paper, we focus by contrast on the case
of "dumb" sensors, i.e., sensors which are able to process information with little or no knowledge
of their mission or their environment. More precisely, we separately study the following linear
strategies.

1) Random iid precoders: A natural way to design dumb sensor networks is to select each
sensor's precoder at random, independently of the network's mission. Motivated first by
the simplicity of the approach and second by its widespread use in compressive sensing
applications, we assume that matrix $A_n$ is one realization of an $n \times k$ random matrix
with zero mean iid entries. In the case of random iid precoders, sensors are able to precode
their information without any instructions from the fusion center.

2) Orthogonal precoders: In this case, matrix $A_n$ is such that $A_n^T A_n = I_k$, i.e., the precoders
$\mathbf{a}_1, \dots, \mathbf{a}_k$ are orthogonal. Orthogonal precoders will prove useful for the design of dumb
but nevertheless efficient sensor networks. Indeed, under this constraint, we are able to
exhibit strategies that achieve the best error exponent. In addition, when such precoders
are used, we will show that a low complexity testing procedure can be implemented as an
alternative to the costly Likelihood Ratio Test, without decreasing the error exponent.
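
The two families can be illustrated with a few lines of code. This is a minimal sketch, not the construction used in the paper: the function names are hypothetical, Gaussian entries are only one admissible choice of zero mean iid distribution, and the QR-based orthogonal precoder is a generic example of a matrix with orthonormal columns, unrelated to the specific orthogonal strategies studied in Section IV.

```python
import numpy as np

def random_iid_precoder(n, k, rng):
    """n x k matrix with zero mean iid entries, columns rescaled to unit norm."""
    A = rng.standard_normal((n, k))
    return A / np.linalg.norm(A, axis=0)

def generic_orthogonal_precoder(n, k, rng):
    """n x k matrix with orthonormal columns (A^T A = I_k), obtained by a QR factorization."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return Q

rng = np.random.default_rng(0)
A_rnd = random_iid_precoder(64, 16, rng)
A_orth = generic_orthogonal_precoder(64, 16, rng)
```

The column rescaling in the first function reflects the normalization $\|\mathbf{a}_i\| = 1$, which does not change the performance of the LRT.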

IV. Error Exponents 

A. Case of random iid precoders 

Before stating the main result of this subsection, remark that the performance of the test is of
course expected to depend on the covariance matrix $\Gamma_n$ of the signal to be detected. In particular,
it is useful to recall some well known results on the behaviour of the eigenvalues of $\Gamma_n$. From
classical results on large Toeplitz matrices [?], it is known that $\Gamma_n$ can be approximated by a
circulant matrix with eigenvalues $f(0), f(2\pi/n), \dots, f(2\pi(n-1)/n)$. More precisely, for any Hermitian
$n \times n$ matrix $Q$, we denote by $F_Q(t) = \#\{i : \lambda_i(Q) \le t\}/n$ the distribution function associated with the
empirical distribution of the eigenvalues $\lambda_1(Q), \dots, \lambda_n(Q)$ of $Q$ (the corresponding probability
measure is often referred to as the spectral measure of $Q$). Szegő's Theorem ([7], p. 64) states
that, provided that Assumption A1 holds, $F_{\Gamma_n}$ converges weakly to the distribution function $\Phi$
defined by:

$$ \Phi(t) = \frac{1}{2\pi}\, \mathrm{Leb} \circ f^{-1}((-\infty, t]) , \qquad (10) $$

where we recall that $\mathrm{Leb} \circ f^{-1}$ is the measure which composes the Lebesgue measure Leb on
$[-\pi, \pi]$ with $f^{-1}$ (the inverse image under $f$). The error exponent depends only on the latter
limiting spectral measure $\Phi$, as stated by the following Theorem.

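Theorem 1 Suppose that (7) holds for some $c \in (0, 1)$ and assume A1. For each $n$, let $A_n =
(A^n_{ij})$ be an $n \times k$ real random matrix such that the $A^n_{ij}$, for all $n, i, j$, are iid zero mean random
variables with finite second order moment. Consider any fixed level $\alpha \in (0, 1)$. Then the linear
strategy $\mathcal{A} = (A_n)_n$ admits an error exponent $K_\alpha(\mathcal{A}) = K_{\mathrm{rnd}}(c)$ given by:

$$ K_{\mathrm{rnd}}(c) = -c + \sigma^2 c \beta - \frac{c}{2} \log(\sigma^2 \beta) + \frac{1}{2} \int \log(1 + c \beta t)\, d\Phi(t) , \qquad (11) $$

where $\beta$ is the unique solution to the following equation:

$$ \beta = \left( \sigma^2 + \int \frac{t}{1 + c \beta t}\, d\Phi(t) \right)^{-1} . \qquad (12) $$

The proof is provided in Appendix A-A.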
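
For readers who wish to evaluate the exponent of Theorem 1 numerically, here is a rough sketch. It rests on explicit assumptions that should be checked against the original statement: the exponent is taken to be $K_{\mathrm{rnd}}(c) = -c + \sigma^2 c\beta - \frac{c}{2}\log(\sigma^2\beta) + \frac{1}{2}\int \log(1 + c\beta t)\,d\Phi(t)$ with $\beta$ solving $\beta = (\sigma^2 + \int \frac{t}{1 + c\beta t}\,d\Phi(t))^{-1}$, integrals against $\Phi$ are replaced by averages of $f$ over a uniform frequency grid, and the fixed point is found by plain iteration. The function names are hypothetical.

```python
import numpy as np

def solve_beta(t, c, sigma2, iters=500):
    """Fixed-point iteration for beta = 1 / (sigma2 + mean(t / (1 + c*beta*t))),
    where t holds samples f(w) on a uniform grid (assumed form of the equation)."""
    beta = 1.0 / sigma2
    for _ in range(iters):
        beta = 1.0 / (sigma2 + np.mean(t / (1.0 + c * beta * t)))
    return beta

def k_rnd(f, c, sigma2, grid=4096):
    """Numerical value of the assumed expression of K_rnd(c)."""
    w = -np.pi + 2 * np.pi * np.arange(grid) / grid
    t = f(w)                      # averaging g(f(w)) over w approximates int g dPhi
    beta = solve_beta(t, c, sigma2)
    return (-c + sigma2 * c * beta
            - 0.5 * c * np.log(sigma2 * beta)
            + 0.5 * np.mean(np.log1p(c * beta * t)))

def f_ar1(w, a=0.6):
    """Hypothetical example: unit-variance AR(1) spectral density."""
    return (1 - a**2) / np.abs(1 - a * np.exp(1j * w)) ** 2

curve = [k_rnd(f_ar1, c, sigma2=1.0) for c in (0.1, 0.5, 0.9)]
```

Two sanity checks follow from this form: the exponent vanishes when there is no signal ($f \equiv 0$), and for a flat density and very small $c$ it reduces to $c$ times the Kullback-Leibler divergence between $\mathcal{N}(0, \sigma^2)$ and $\mathcal{N}(0, \sigma^2 + f)$.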

B. Case of orthogonal precoders

We now focus on the case where $A_n^T A_n = I_k$. Our aim is first to prove that among all
orthogonal strategies, we may determine some that achieve the maximum error exponent and
second, to determine this maximum error exponent. Results are provided below in Theorem 2.
We first provide some definitions along with some insights on the results.

Loosely speaking, it is easy to think of a relevant orthogonal strategy as follows. Focus
on one given sensor $i \in \{1, \dots, k\}$ for the sake of simplicity. Under $H_0$, the received sequence
$\mathbf{y}_i = \mathbf{w}_i$ corresponds to a white Gaussian noise of variance $\sigma^2$. Therefore the law of $z_i = \mathbf{a}_i^T \mathbf{y}_i$
is $\mathcal{N}(0, \sigma^2)$. Under $H_1$, it is straightforward to show that $z_i \sim \mathcal{N}(0, \mathbf{a}_i^T \Gamma_n \mathbf{a}_i + \sigma^2)$, where we
recall that $\Gamma_n = E(\mathbf{x}\mathbf{x}^T)$ is the signal covariance matrix. Clearly, the best way for sensor
$i$ to discriminate $H_1$ versus $H_0$ is to choose the precoder $\mathbf{a}_i$ which maximizes the variance
$\mathbf{a}_i^T \Gamma_n \mathbf{a}_i + \sigma^2$. This is achieved when $\mathbf{a}_i$ coincides with the eigenvector of $\Gamma_n$ associated with the
largest eigenvalue. Generalizing this remark to $k$ sensors, it is natural to introduce the strategy
for which the $k$ precoders $\mathbf{a}_1, \dots, \mathbf{a}_k$ coincide with the $k$ eigenvectors of $\Gamma_n$ associated with the
largest eigenvalues. We shall refer to this strategy as the Principal Component Strategy (PCS).

Definition 1 (principal components strategy (PCS)) Let $(\mathbf{u}^n_j)_{1 \le j \le n}$ be the eigenvectors of $\Gamma_n$
and $(\lambda^n_j)_{1 \le j \le n}$ be the corresponding eigenvalues, ordered in such a way that $\lambda^n_1 \ge \lambda^n_2 \ge \dots \ge \lambda^n_n$.
The principal component strategy $\mathcal{V}$ is defined as the sequence of $n \times k$ matrices $V_n$ given by:

$$ V_n = [\mathbf{u}^n_1, \dots, \mathbf{u}^n_k] . $$
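
Definition 1 translates directly into an eigendecomposition. The sketch below is illustrative only; the function name `pcs_precoder` and the AR(1)-type covariance used as input are assumptions of this example.

```python
import numpy as np

def pcs_precoder(gamma, k):
    """Return the n x k matrix whose columns are the eigenvectors of gamma
    associated with its k largest eigenvalues."""
    eigvals, eigvecs = np.linalg.eigh(gamma)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]           # indices of the k largest
    return eigvecs[:, order]

n, k = 16, 4
idx = np.arange(n)
gamma = 0.6 ** np.abs(idx[:, None] - idx[None, :])  # hypothetical Toeplitz covariance
V = pcs_precoder(gamma, k)
captured = np.diag(V.T @ gamma @ V)                 # signal variance seen by each sensor
```

The diagonal of $V_n^T \Gamma_n V_n$ gives the signal variance available to each sensor, which for PCS equals the $k$ largest eigenvalues of $\Gamma_n$.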

As will be stated by Theorem 2 below, PCS achieves the maximum error exponent among
all orthogonal strategies. Unfortunately, exact PCS might be difficult to implement in a dumb
sensor network, as each node needs to be informed of a whole eigenvector of the covariance
matrix $\Gamma_n$. This requires involved cooperation between the nodes and the fusion center. In order
to reduce the amount of overhead in the network, we propose an alternative strategy which turns
out to achieve the same error exponent as PCS. Let $F_n = [F_n(t, j)]_{0 \le t, j \le n-1}$ denote the $n \times n$
real-valued orthogonal Fourier basis matrix, that is $F_n = [\mathbf{e}^n_0, \dots, \mathbf{e}^n_{n-1}]$, where the columns
are defined, up to a normalizing constant, by the constant sequence together with the sequences

$$ t \mapsto \cos(2\pi j t / n) \quad \text{for } j = 1, \dots, \lfloor n/2 \rfloor , $$
$$ t \mapsto \sin(2\pi j t / n) \quad \text{for } j = 1, \dots, \lfloor (n-1)/2 \rfloor . $$

The main idea is to remark that for large $n$, the covariance matrix $\Gamma_n$ can be approximated by
the matrix

$$ F_n\, \mathrm{diag}\big(f(0), \dots, f(2\pi(n-1)/n)\big)\, F_n^T , \qquad (13) $$

see [7], [?] for more details. As a consequence, it seems reasonable to propose a strategy inspired
by PCS, only replacing the true covariance matrix $\Gamma_n$ with the above matrix (13). This leads
to the following definition.

Definition 2 (principal frequencies strategy (PFS)) For each $n$, denote by $(j^n_1, \dots, j^n_n)$ any
permutation of $\{0, 1, \dots, n-1\}$ such that $f(2\pi j^n_1/n) \ge \dots \ge f(2\pi j^n_n/n)$. The principal
frequencies strategy $\mathcal{W}$ is defined as the sequence of $n \times k$ matrices $W_n$ given by:

$$ W_n = [\mathbf{e}^n_{j^n_1}, \dots, \mathbf{e}^n_{j^n_k}] , \qquad (14) $$

where $\mathbf{e}^n_0, \dots, \mathbf{e}^n_{n-1}$ are the columns of matrix $F_n$.

Note that PFS only requires transmitting one of the $k$ indices $j^n_1, \dots, j^n_k$ corresponding to the
principal frequencies of $f$ to each sensor. In return, each sensor $i$ computes the scalar product
between the $j^n_i$-th column of the Fourier matrix $F_n$ and its received sequence $\mathbf{y}_i$. In other words, it
computes the value of the (real) periodogram of $\mathbf{y}_i$ at frequency $2\pi j^n_i/n$. The following result
proves furthermore that both PCS and PFS achieve the best error exponent among all orthogonal
strategies.

For any $c \in (0, 1)$ denote by $\Delta_c$ the following set of frequencies:

$$ \Delta_c = \{\omega \in (-\pi, \pi) : \Phi \circ f(\omega) > 1 - c\} . \qquad (15) $$

It is worth noting that the Lebesgue measure of $\Delta_c$ is equal to $2\pi c$ (see Lemma 5).

Theorem 2 Suppose that (7) holds for some $c \in (0, 1)$ and assume A1 and A2. Let $\mathcal{V}$ and $\mathcal{W}$
respectively denote the PCS and PFS as defined above. For any $\alpha \in (0, 1)$, the error exponents
$K_\alpha(\mathcal{V})$ and $K_\alpha(\mathcal{W})$ associated with $\mathcal{V}$ and $\mathcal{W}$ exist, and are such that $K_\alpha(\mathcal{V}) = K_\alpha(\mathcal{W}) =
K_{\mathrm{orth}}(c)$ where

$$ K_{\mathrm{orth}}(c) = \frac{1}{2\pi} \int_{\Delta_c} D\big(\mathcal{N}(0, \sigma^2) \,\|\, \mathcal{N}(0, f(\omega) + \sigma^2)\big)\, d\omega , \qquad (16) $$

where $D$ denotes the Kullback-Leibler contrast. Moreover, for any orthogonal strategy $\mathcal{A}$,

$$ \overline{K}_\alpha(\mathcal{A}) \le K_{\mathrm{orth}}(c) . \qquad (17) $$
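
The PFS precoding matrix can be built explicitly as follows. This is a sketch under stated assumptions: the column indexing (cosine of frequency $2\pi j/n$ in column $j$, sine of the same frequency in column $n - j$) is one convenient convention chosen for the example and may differ from the ordering used in the paper, ties between equal values of $f$ are broken by index, and the function names and the AR(1) density are hypothetical.

```python
import numpy as np

def real_fourier_basis(n):
    """n x n real orthogonal Fourier matrix: constant, cosine and sine columns.
    Assumed indexing: column j holds cos(2*pi*j*t/n), column n-j holds sin(2*pi*j*t/n)."""
    t = np.arange(n)
    F = np.empty((n, n))
    F[:, 0] = 1.0
    for j in range(1, n // 2 + 1):
        F[:, j] = np.cos(2 * np.pi * j * t / n)
    for j in range(1, (n - 1) // 2 + 1):
        F[:, n - j] = np.sin(2 * np.pi * j * t / n)
    return F / np.linalg.norm(F, axis=0)            # normalize each column

def pfs_precoder(f, n, k):
    """Select the k columns of the Fourier basis whose frequencies 2*pi*j/n
    carry the largest values of the spectral density f."""
    values = f(2 * np.pi * np.arange(n) / n)
    order = np.argsort(-values, kind="stable")[:k]  # indices j_1, ..., j_k
    return real_fourier_basis(n)[:, order], order

def f_ar1(w, a=0.6):
    """Hypothetical example: unit-variance AR(1) spectral density (2*pi-periodic)."""
    return (1 - a**2) / np.abs(1 - a * np.exp(1j * w)) ** 2

W, selected = pfs_precoder(f_ar1, n=16, k=4)
```

Only the integer index of its column has to be sent to each sensor, which is the source of the low signaling overhead of the strategy.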



The proof is provided in Appendix A-B. Let us briefly comment on the best error exponent
formula (16). First we recall that for any $\sigma_1^2, \sigma_2^2 > 0$,

$$ D\big(\mathcal{N}(0, \sigma_1^2) \,\|\, \mathcal{N}(0, \sigma_2^2)\big) = -\frac{1}{2} \left( \log \frac{\sigma_1^2}{\sigma_2^2} + 1 - \frac{\sigma_1^2}{\sigma_2^2} \right) , $$

which is increasing as $\sigma_1/\sigma_2$ gets away from 1 from above or below. Since $\Phi$ is nondecreasing, we
see that the frequencies $\omega$ lying in $\Delta_c$ are those that maximize $D(\mathcal{N}(0, \sigma^2) \,\|\, \mathcal{N}(0, f(\omega) + \sigma^2))$
in $[-\pi, \pi]$. Thus $K_{\mathrm{orth}}(c)$ can be interpreted as some distance between the two spectral densities
$\sigma^2$ (corresponding to $H_0$) and $f + \sigma^2$ (corresponding to $H_1$) restricted to a set of frequencies
where these two spectral densities are the furthest apart.

C. Illustration and comparisons

The error exponents $K_{\mathrm{orth}}$ and $K_{\mathrm{rnd}}$ defined in Sections IV-B and IV-A depend on the following
parameters: the spectral density $f$, the noise level $\sigma$, and the sensors growth ratio $c$. When
using the orthogonal strategy, one can expect that the more peaky $f$ is, the more efficient the
compression will be. That is, by using only a few sensors configured at the peak frequencies,
one will get an attractive error exponent. This should also lead to a sharp increase of the error
exponent curve $K_{\mathrm{orth}}(c)$ for small $c$. On the contrary, when $f$ is nearly flat (with a small
range of values), there are no privileged frequencies for the sensors to forward and the error
exponent should increase slowly as $c$ gets larger. Let us illustrate these intuitive arguments with
numerical experiments. We consider two spectral densities corresponding to ARMA processes,
of the form

$$ f_1(\omega) = s_1 \left| \frac{1 - a_1 e^{i\omega} + a_2 e^{2i\omega}}{1 - b_1 e^{i\omega} - b_2 e^{2i\omega} - b_3 e^{3i\omega}} \right|^2 , \qquad f_2(\omega) = s_2 \left| \frac{1 + c_1 e^{i\omega} - c_2 e^{2i\omega}}{1 + d_1 e^{i\omega} - d_2 e^{2i\omega}} \right|^2 , $$

where $a_j, b_j, c_j, d_j$ are fixed real coefficients whose numerical values are not legible here.
The corresponding plots are depicted in Fig. 2.

Fig. 2. Left: Spectral density $f_1$ with $s_1$ adjusted such that $\frac{1}{2\pi}\int_{-\pi}^{\pi} f_1 = 1$. Right: Spectral density $f_2$ with $s_2$ adjusted such
that $\frac{1}{2\pi}\int_{-\pi}^{\pi} f_2 = 1$. One can notice that $f_1$ has a sharp peak while $f_2$ takes its values in a much smaller range.
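
The orthogonal exponent lends itself to a simple numerical evaluation: sample $f$ on a uniform frequency grid, keep the fraction $c$ of frequencies with the largest values of $f$, and average the Kullback-Leibler divergence over them. The sketch below follows this recipe under the assumption that $K_{\mathrm{orth}}(c) = \frac{1}{2\pi}\int_{\Delta_c} D(\mathcal{N}(0,\sigma^2)\,\|\,\mathcal{N}(0, f(\omega)+\sigma^2))\,d\omega$; the function names and the AR(1) density standing in for $f_1$ and $f_2$ are hypothetical.

```python
import numpy as np

def kl_zero_mean_gauss(s1, s2):
    """Kullback-Leibler divergence D(N(0, s1) || N(0, s2)) for variances s1, s2."""
    return 0.5 * (np.log(s2 / s1) - 1.0 + s1 / s2)

def k_orth(f, c, sigma2, grid=4096):
    """Riemann-sum approximation of the orthogonal error exponent at ratio c."""
    w = -np.pi + 2 * np.pi * np.arange(grid) / grid
    top = np.sort(f(w))[::-1][: int(round(c * grid))]   # largest values of f
    return kl_zero_mean_gauss(sigma2, top + sigma2).sum() / grid

def k_fst(f, c, sigma2):
    """Exponent c * K_orth(1) of the strategy keeping the first k samples."""
    return c * k_orth(f, 1.0, sigma2)

def f_ar1(w, a=0.6):
    """Hypothetical example: unit-variance AR(1) spectral density."""
    return (1 - a**2) / np.abs(1 - a * np.exp(1j * w)) ** 2

cs = (0.1, 0.5, 1.0)
orth_curve = [k_orth(f_ar1, c, 1.0) for c in cs]
fst_curve = [k_fst(f_ar1, c, 1.0) for c in cs]
```

Selecting the largest values of $f$ is equivalent to selecting the largest divergences, since $D(\mathcal{N}(0,\sigma^2)\,\|\,\mathcal{N}(0, f+\sigma^2))$ increases with $f \ge 0$.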



Fig. 3. Error exponents $K_{\mathrm{iid}}(c)$, $K_{\mathrm{orth}}(c)$ and $K_{\mathrm{fst}}(c)$ as functions of the growth ratio $c = \lim k/n$ (horizontal axis: sensors growth rate $c$), for spectral density
functions $f_1$ (left) and $f_2$ (right).



Fig. 3 represents $K_{\mathrm{iid}}(c)$ (the exponent $K_{\mathrm{rnd}}(c)$ of Theorem 1) and $K_{\mathrm{orth}}(c)$ for $\sigma = 1$. For comparison, we also plot another error
exponent curve, corresponding to an orthogonal, yet suboptimal strategy which uses precoding
matrices $A_n = [I_k \ 0]^T$. This strategy amounts to keeping only the first $k$ values of the signal,
independently of $f$. It is straightforward to prove that the corresponding error exponent is

$$ K_{\mathrm{fst}}(c) = c \cdot K_{\mathrm{orth}}(1) . $$

One can notice several numerical facts in Fig. 3. First, as expected from Section IV-B, $K_{\mathrm{fst}}$
is always below $K_{\mathrm{orth}}$. Also as expected, $K_{\mathrm{orth}}$ has a sharper increase near $c = 0$ when
used with $f_1$ than when used with $f_2$. The fact that the random iid strategy seems to behave
better for $c$ close to 1 is more surprising, but it reveals the following interesting fact: in some
circumstances, a non-orthogonal strategy may outperform an optimal orthogonal strategy. Let
us try to interpret this result. When setting up the sensor network design, one faces a tradeoff.

Fig. 4. Error exponent curves (horizontal axis: sensors growth rate $c$) for spectral density functions $f_1$ (left) and $f_2$ (right) with $\sigma^2 = 1/2$ (top) and $\sigma^2 = 4$ (bottom).

Either one uses an extra sensor on the same frequency as a previous sensor in order to denoise
their common measurement, or one uses this extra sensor on a new frequency in order to discover
another part of the spectral density $f$. At small levels of noise, it is always more interesting to
discover $f$ at new frequencies than to denoise ones already used by other sensors, indicating
that the orthogonal strategy is always the best. But for high levels of noise, it may become more
efficient to repeat (and thus denoise) key frequencies than to discover less important ones. To
support this claim, we refer to Fig. 4 where we chose two levels of noise, one that is larger
($\sigma^2 = 4$) than the one used in Fig. 3 and one that is smaller ($\sigma^2 = .5$). One can see that
when $\sigma^2 = .5$, the best orthogonal strategy outperforms the two others, whereas for $\sigma^2 = 4$ the
upcrossing of $K_{\mathrm{iid}}$ over $K_{\mathrm{orth}}$ near $c = 1$ is more pronounced than in Fig. 3. The same conclusions
hold for both spectral densities $f_1$ and $f_2$.

V. A PFS-BASED LOW COMPLEXITY TEST 

Results of the previous section indicate that the principal frequencies strategy is a good
candidate for implementation in dumb sensor networks. Indeed it requires only little cooperation
between the nodes and the fusion center, and is attractive from an error exponent perspective. In
this section, we prove furthermore that when PFS is used, a test procedure can be proposed
which is much less complex than the LRT, and which nevertheless achieves the same error
exponent.

We assume throughout this section that PFS is used, i.e., each precoding matrix is given by
$A_n = W_n$ where $W_n$ is defined by (14).

A. A low complexity test 

Recall that the LRT rejects the null hypothesis when the LLR (5) is above a threshold. As the
terms $k \log \sigma^2$ and $\log\det(A_n^T \Gamma_n A_n + \sigma^2 I_k)$ are constant w.r.t. the observation $\mathbf{z}$, it is clear that
the LRT reduces to the test which rejects the null hypothesis for large values of the statistic:

$$ \frac{\|\mathbf{z}\|^2}{\sigma^2} - \mathbf{z}^T (A_n^T \Gamma_n A_n + \sigma^2 I_k)^{-1} \mathbf{z} . \qquad (18) $$

Unfortunately, the evaluation of the above statistic is computationally demanding as $k$ gets
larger, since it requires the inversion of the $k \times k$ matrix $A_n^T \Gamma_n A_n + \sigma^2 I_k$. In order to avoid this,
we propose to replace matrix $\Gamma_n$ in (18) with its circulant approximation given by (13). In other
words, product $A_n^T \Gamma_n A_n$ is replaced by:

$$ A_n^T F_n\, \mathrm{diag}\big(f(0), \dots, f(2\pi(n-1)/n)\big)\, F_n^T A_n = \mathrm{diag}\big(f(2\pi j^n_1/n), \dots, f(2\pi j^n_k/n)\big) . $$

This leads directly to the following procedure.

PFS low complexity (PFSLC) Test: Reject hypothesis $H_0$ when the statistic $T_n$ defined by:

$$ T_n = \frac{\|\mathbf{z}\|^2}{\sigma^2} - \sum_{i=1}^{k} \frac{z_i^2}{\sigma^2 + f(2\pi j^n_i/n)} \qquad (19) $$

is larger than a threshold.

Although this statistic cannot give rise to a better test than the LRT, its numerical simplicity
makes it worth considering. In the next paragraph, we study the performance of the test
and we prove that it performs as well as the LRT in terms of error exponent.
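
To make the complexity gain concrete, the sketch below compares the LRT statistic, which requires solving a $k \times k$ linear system, with a diagonal low complexity statistic that costs only $O(k)$ operations. It assumes the low complexity statistic has the form $\|\mathbf{z}\|^2/\sigma^2 - \sum_i z_i^2/(\sigma^2 + f(2\pi j^n_i/n))$, i.e. the LRT statistic with $A_n^T \Gamma_n A_n$ replaced by a diagonal matrix; the function names and test values are hypothetical.

```python
import numpy as np

def lrt_statistic(z, signal_cov, sigma2):
    """LRT statistic ||z||^2/sigma2 - z^T (signal_cov + sigma2*I)^{-1} z,
    where signal_cov stands for the k x k matrix A_n^T Gamma_n A_n."""
    k = z.shape[0]
    return z @ z / sigma2 - z @ np.linalg.solve(signal_cov + sigma2 * np.eye(k), z)

def pfslc_statistic(z, f_selected, sigma2):
    """Low complexity statistic: signal_cov is replaced by diag(f_selected),
    where f_selected[i] is the value of f at the frequency assigned to sensor i."""
    return z @ z / sigma2 - np.sum(z**2 / (sigma2 + f_selected))

rng = np.random.default_rng(0)
z = rng.standard_normal(5)
f_selected = np.array([3.0, 2.5, 2.0, 1.0, 0.5])   # hypothetical principal frequency values
t_low = pfslc_statistic(z, f_selected, sigma2=1.0)
t_lrt = lrt_statistic(z, np.diag(f_selected), sigma2=1.0)
```

When the signal covariance seen by the fusion center is exactly diagonal, the two statistics coincide; in general they differ by the error of the circulant approximation.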

B. Asymptotic optimality of the PFSLC test 

As the statistic (19) is no longer a likelihood ratio, Lemma 1 cannot be used to evaluate
the error exponent associated with the test (19). Instead, we must resort to arguments of large
deviations theory. Specifically, we shall study the large deviation behaviour of the test associated
with this statistic, that is the limit of $-n^{-1} \log P_1(T_n < \eta_n(\alpha))$ where $\eta_n(\alpha)$ is the $(1 - \alpha)$-quantile
of the statistic $T_n$ under $H_0$, $P_0(T_n > \eta_n(\alpha)) = \alpha$. Under mild assumptions, we show below
that this limit is given by the error exponent of the PFS. Hence, as far as error exponents are
concerned, there is no loss in performance in using the statistic defined in (19) rather
than the likelihood ratio.

Theorem 3 Assume that A1 and A2 hold true. For any level $\alpha \in (0, 1)$, the statistic $T_n$
defined in (19) satisfies the following property. For $\eta_n(\alpha)$ such that $P_0(T_n > \eta_n(\alpha)) = \alpha$, the
miss probability $P_1(T_n < \eta_n(\alpha))$ satisfies

$$ \lim_{n \to \infty} -\frac{1}{n} \log P_1(T_n < \eta_n(\alpha)) = K_{\mathrm{orth}}(c) . $$

The proof of this result is provided in Appendix A-C.

VI. Simulations 

The error exponent theory is inherently asymptotic. In this section, we provide numerical
experiments that assess the performance of the PFS on simulated data for finite $n$; its error
exponent has already been shown to coincide with that of the PCS. The point here is to test
how relevant the error exponent theory is for finite $n$.

We use the same spectral density functions $f_1$ and $f_2$ as in Section IV-C, whose error exponents
are displayed in Fig. 3. We now compare, for a couple of values of $c$, the finite sample
performance of the LRT with the iid, PFS and PCS strategies by using their empirical Receiver
Operating Characteristic (ROC) curves. When the PFS is used, we also consider the PFSLC test
of Section V. We have shown that the LRT with the PFS or the PCS and the PFSLC test share
the same error exponent curve. How well this performance measure reflects the whole
ROC curves at finite sample size is displayed in Fig. 5. It turns out that the PCS, the PFS and
the PFSLC have similar ROC curves, as indicated by the error exponent analysis. One can also
notice the good performance of the PFS, PFSLC and PCS when $c = 0.1$, $\sigma = 1$ and $n = 100$ for
$f_1$, which confirms the conclusions drawn from the error exponent curves in Fig. 3. For $c = 0.9$,
$\sigma = 2$ and $n = 100$, one can notice that the error exponent curves also provide a good prediction:
the iid strategy slightly outperforms the PFS, PCS and PFSLC.

[Figure 5: four ROC panels. Top row: "ROC curves (c = .1, n = 100)". Bottom row: "ROC curves (c = .9, n = 100)".]

Fig. 5. ROC curves associated with the PFS, PCS, PFSLC and random iid strategy for spectral density functions $f_1$ (left) and
$f_2$ (right). Top: $c = 0.1$ and $\sigma = 1$. Bottom: $c = 0.9$ and $\sigma = 2$. As predicted by the error exponent curves, the iid strategy is less
efficient when $\sigma = 1$ but slightly more efficient when $\sigma = 2$ for large values of $c$.
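
Empirical ROC curves of this kind can be obtained from Monte Carlo samples of any test statistic drawn under both hypotheses. The following is only a minimal sketch and not the simulation code used for Fig. 5: the function name `empirical_roc` and the toy energy statistics are assumptions made for illustration.

```python
import numpy as np

def empirical_roc(stat_h0, stat_h1):
    """Empirical (false alarm, detection) pairs for the test 'reject H0 when T > threshold'."""
    stat_h0 = np.asarray(stat_h0, dtype=float)
    stat_h1 = np.asarray(stat_h1, dtype=float)
    # Sweep the threshold over all observed values, from the largest to the smallest.
    thresholds = np.sort(np.concatenate([stat_h0, stat_h1]))[::-1]
    pfa = np.array([(stat_h0 > t).mean() for t in thresholds])
    pdet = np.array([(stat_h1 > t).mean() for t in thresholds])
    return pfa, pdet

# Toy data (hypothetical): energy of k Gaussian samples with variance 1 (H0) or 2 (H1).
rng = np.random.default_rng(0)
k, trials = 10, 2000
t0 = (rng.standard_normal((trials, k)) ** 2).sum(axis=1)
t1 = ((np.sqrt(2.0) * rng.standard_normal((trials, k))) ** 2).sum(axis=1)
pfa, pdet = empirical_roc(t0, t1)
```

Plotting `pdet` against `pfa` gives the ROC curve; the same function can be applied to samples of the LRT or PFSLC statistics once these are simulated.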



VII. Conclusion 

In this paper, we studied the performance of the Neyman-Pearson detection of a stationary 
Gaussian process in noise, using a large wireless sensor network (WSN). Our results are relevant 
for the design of sensor networks which are constrained by limited signaling and communication 
overhead between the fusion center and the sensor nodes. We studied the case where each 
sensor compresses its observation sequence using either a random iid linear precoder or an 
orthogonal precoder. In the random precoder case, we determined the error exponent governing 
the asymptotic behaviour of the miss probability, when $k, n \to \infty$ and $k/n \to c \in (0,1)$. In
the orthogonal precoder case, we exhibited strategies (PCS and PFS) that achieve the best error
exponent among all orthogonal strategies. The PFS moreover has the attractive property of being
well suited for WSNs with signaling overhead constraints. In addition, we proved that when the
PFS is used, a low complexity test can be implemented at the FC as an alternative to the
Likelihood Ratio (Neyman-Pearson) test. Interestingly, the proposed test performs as well as the
LRT in terms of error exponents.

Appendix A
Proofs of main results

Observe that we may set $\sigma^2 = 1$ without loss of generality, since this amounts to dividing $f$ by
$\sigma^2$ and the data $y_i$ by $\sigma$. Hence, in the following proof sections, we assume $\sigma = 1$. In particular,
the LLR in (5) for a precoding matrix $A_n$ with normalized precoders $a_i$, $i = 1, \dots, k$, is given
by

$2\mathcal{L}_{A_n} = \|z\|^2 - \log\det(A_n^T \Gamma_n A_n + I_k) - z^T (A_n^T \Gamma_n A_n + I_k)^{-1} z$ . (20)

A. Proof of Theorem 1

We assume without restriction that $E[(A_{1,1})^2] = 1$. Due to Lemma 1, it is sufficient to prove that
the normalized LLR associated with strategy $A$ converges in probability to the rhs of equation (11)
under $H_0$. Expression (20) of the LLR relies on the assumption that each precoder has unit
norm, which is generally not the case for $A_n$ defined as in Theorem 1. Since the false alarm and
miss probabilities of this LRT do not depend on the norms $\|a_i\|$, $i = 1, \dots, k$, it is equivalent
to consider precoders defined by the matrix

$\tilde{A}_n = A_n P_n^{-1/2}$ , (21)

where $P_n = \mathrm{diag}\big(\sum_{i=1}^{n} (A_{i,j})^2 : j = 1, \dots, k\big)$, with $A_{i,j}$ the $(i,j)$ entry of $A_n$. With this definition, we may use expression (20),
which is valid for normalized precoders. In order to prove Theorem 1, it is now sufficient to
show that $-(1/n)\mathcal{L}_{\tilde{A}_n}$ converges to the constant $K_{\mathrm{rnd}}$ defined in (11).

The main issue lies in the asymptotic study of the two terms $\frac{1}{n} \log\det(\tilde{A}_n^T \Gamma_n \tilde{A}_n + I_k)$ and
$\frac{1}{n} z^T (\tilde{A}_n^T \Gamma_n \tilde{A}_n + I_k)^{-1} z$. This can be done by successively using the results of [12], [10] and [11].
The crucial point is to characterize the limiting spectral measure of the matrix $\tilde{A}_n^T \Gamma_n \tilde{A}_n$. Define:

$R_n = \tilde{A}_n^T \Gamma_n \tilde{A}_n$ ,
$S_n = \frac{1}{n} A_n^T \Gamma_n A_n$ .

First, we prove (see Lemma 2 below) that the spectral measure of $R_n$ is asymptotically close to
the one of $S_n$ in a sense which is made clear below. Second, we apply the results of [12], [10]
along with [7] to determine the limiting spectral measure of $S_n$. Finally, closed form expressions
of the desired quantities follow from the results of [11].

Denote by $d$ the Lévy distance on the set of distribution functions. We recall that $F_Q$ denotes
the distribution function of the spectral measure of $Q$ (see Section IV-A).

Lemma 2 As $n, k \to \infty$, $d(F_{R_n}, F_{S_n})$ converges to zero in probability.

Proof: The proof relies on Bai's formula (see (I3l and ^) which provides the following 
bound on the Levy distance: 

d\F^T^, Fbtb) <\\A- B\WA\^ + \Bf) {22) 
n k 

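for any $n \times k$ real matrices $A$, $B$, where $\|A\| = \sqrt{\mathrm{Tr}[A^T A]}$ denotes the Frobenius norm of $A$.
We now use equation (22) with $A \leftarrow \Gamma_n^{1/2} \tilde{A}_n$ and $B \leftarrow \frac{1}{\sqrt{n}} \Gamma_n^{1/2} A_n$. Using (21) and introducing
$\Delta_n = [P_n^{-1/2} - \frac{1}{\sqrt{n}} I_k]^2$, it is straightforward to show that: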

2n 1 
d\FR^,Fs,^) < — Tr[A„5„] (Tr[/^-^5„] + - Tr Sn) • (23) 

Note that ^TrS",! < ^^Tr[A^A„], where p(r„) denotes the spectral radius of r„. Similarly, 
TiP-^Sn = TiRn < p(r„) iTri^i„ = p(r„)^ Finally, Tr A„5„ < p(r„)p(^A^A„) Tr A„. 
Putting all pieces together, 

d\FR„,FsJ<^nTTAn (24) 

where 

Kn='^p{Tn)^- + \TTAj;An] p(-AlAn) . 

From [Q, p(r„) converges to Afj = sup(/). By the law of large numbers, ^ Ti An converges 
almost surely (a.s.) to c. From fl4\. p{^A'^An) converges a.s. to (1 + v^)^. Therefore, k„ 
converges a.s. to 4M|c(l + (l + -\/c)^). In order to prove that d{Fji^, Fs„) converges in probability 
to zero, it is sufficient, by equation (|24)) . to prove that TrA„ converges in probability to zero. 
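We write $\mathrm{Tr}\,\Delta_n$ as follows:

$\mathrm{Tr}\,\Delta_n = \frac{1}{n} \sum_{j=1}^{k} U_j$ ,

where $U_j = \Big( \big( \frac{1}{n} \sum_{i=1}^{n} (A_{i,j})^2 \big)^{-1/2} - 1 \Big)^2$. Note that for a fixed $n$, the $U_j$ are iid for all $j$. Let
$\epsilon > 0$. By Markov inequality,

$P(\mathrm{Tr}\,\Delta_n > \epsilon) \le \frac{\sum_{j=1}^{k} E[U_j]}{n \epsilon} = \frac{k}{n} \frac{E[U_1]}{\epsilon}$ . (25)

As $U_1$ converges a.s. to zero, we conclude that $P(\mathrm{Tr}\,\Delta_n > \epsilon)$ tends to zero. This completes the
proof of Lemma 2. ■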
Thanks to Lemma 2, it is sufficient to study the asymptotic behaviour of $F_{S_n}$. The latter is
provided by [12], [10]. In order to introduce this result, we need to recall some definitions. For
any distribution function $F$, the Stieltjes transform $b_F$ of $F$ is given by:

$b_F(z) = \int \frac{dF(t)}{t - z}$

for each $z \in \mathbb{C}^+$, where $\mathbb{C}^+ = \{z \in \mathbb{C} : \Im(z) > 0\}$ with $\Im(z)$ denoting the imaginary part
of $z$. Recall that, from the results of [7], the spectral distribution function $F_{\Gamma_n}$ of $\Gamma_n$ converges
weakly to $\Phi$ given by (10). By straightforward application of the results of [12], [10], we obtain
that, with probability one, $F_{S_n}$ converges weakly to a deterministic measure $F$ whose Stieltjes
transform $b = b(z)$ is the unique solution in $\mathbb{C}^+$ of:

$\frac{1}{b} = -z + \int \frac{t}{1 + c\,t\,b} \, d\Phi(t)$ (26)

for each $z \in \mathbb{C}^+$. The above result along with Lemma 2 implies that

$\forall \epsilon > 0, \quad P(d(F_{R_n}, F) > \epsilon) \to 0$ . (27)



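We are now in a position to study the limit of the LLR. We obtain immediately from (20):

$-\frac{1}{n} \mathcal{L}_{\tilde{A}_n} = \frac{k}{2n} \left[ -\frac{\|z\|^2}{k} + \beta_n + \gamma_n + \delta_n \right]$ (28)

where

$\beta_n = \frac{1}{k} \mathrm{Tr}\,(I_k + R_n)^{-1} = \int \frac{1}{1 + t} \, dF_{R_n}(t)$ ,
$\gamma_n = \frac{1}{k} \log\det(I_k + R_n) = \int \log(1 + t) \, dF_{R_n}(t)$ ,
$\delta_n = \frac{1}{k} \left( z^T (I_k + R_n)^{-1} z - \mathrm{Tr}\,(I_k + R_n)^{-1} \right)$ .

Using (27), $\beta_n$ and $\gamma_n$ respectively converge in probability to the constants $\beta$ and $\gamma$ defined by:

$\beta = \int \frac{1}{1 + t} \, dF(t)$ , $\gamma = \int \log(1 + t) \, dF(t)$ .

Recalling that $z \sim \mathcal{N}(0, I_k)$ under $H_0$, the term $\|z\|^2 / k$ in the rhs of (28) converges a.s. to
one. Since $z$ is independent of $A_n$ and since the spectral radius of $(R_n + I_k)^{-1}$ is bounded, it is
straightforward to show that $\delta_n$ converges in probability to zero (use for instance Lemma 2.7 in [15]). Finally, $-(1/n) \mathcal{L}_{\tilde{A}_n}$
converges in probability to:

$K_\alpha(A) = \frac{c}{2} (-1 + \beta + \gamma)$ . (29)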

Constant $\beta$ coincides with the Stieltjes transform of $F$ at point $-1$, that is $\beta = b(-1)$, where we
defined for each $x < 0$, $b(x) = \lim_{z \in \mathbb{C}^+ \to x} b(z)$. Constant $\beta$ is thus the unique solution to (12).
A closed form expression for $\gamma$ can as well be obtained using (for instance) [11]. Using the
fact that the limiting spectral measure associated with $F$ has a bounded support, the dominated
convergence theorem applies to the function $x \mapsto \int \log(x + t) \, dF(t)$. One easily obtains after
some algebra:

Following [11], we conclude that $\gamma = C(1)$ where $C$ is the function defined for each $x > 0$ by:

$C(x) = -1 + x\,b(-x) - \log(x\,b(-x)) + \frac{1}{c} \int \log(1 + c\,t\,b(-x)) \, d\Phi(t)$ .

This statement can simply be proved by noting that $C'(t) = b(-t) - \frac{1}{t}$ (where $C'$ is the derivative
of $C$), which is the derivative of $t \mapsto \int \log(1 + s/t) \, dF(s)$, and $C(\infty) = 0$. Plugging the above expression of $\gamma$ into (29), we obtain the claimed error
exponent $K_{\mathrm{rnd}}$.



B. Proof of Theorem 2

We start with some useful definitions and technical preliminaries. Let $c \in (0, 1)$. Denote
$m_f = \inf(f)$ and $M_f = \sup(f)$ so that, by definition of $\Phi$ in (10), $\Phi(t) = 0$ for all $t < m_f$ and
$\Phi(t) = 1$ for all $t > M_f$. We define the set

$\Lambda_c = \{\lambda \in (m_f, M_f] : \Phi(\lambda) \ge 1 - c\}$ .

By assumptions A1 and A2, $\Phi$ is continuous and strictly increasing from $[m_f, M_f]$ to $[0, 1]$; we
denote by $\Phi^{-1}$ its continuous inverse function, defined from $[0, 1]$ to $[m_f, M_f]$. Hence $\Lambda_c =
[\Phi^{-1}(1 - c), M_f]$. Moreover, using again A2, we obtain that $\mathbb{1}_{\Lambda_c}$ is almost surely continuous
with respect to $\mathrm{Leb} \circ f^{-1}$. By the uniform mapping theorem, this implies that, for any sequence
$(\mu_n)$ of probability measures weakly converging to $(2\pi)^{-1} \mathrm{Leb} \circ f^{-1}$, we have, for all continuous
functions $g : \mathbb{R} \to \mathbb{R}$, as $n \to \infty$,

$\int_{\Lambda_c} g(\lambda) \, d\mu_n(\lambda) \to \frac{1}{2\pi} \int_{\Lambda_c} g(\lambda) \, d\{\mathrm{Leb} \circ f^{-1}\}(\lambda) = \frac{1}{2\pi} \int_{A_c} g \circ f(\omega) \, d\omega$ (30)

where the last equality follows from the definition of $A_c$ in (15) by setting $\lambda = f(\omega)$.

1) The PCS case: The outline of the proof is the following.
Step 1. Assume that

$k = \max\{i \in \{1, \dots, n\} : \Phi(\lambda_i^n) \ge 1 - c\}$ , (31)

where $(\lambda_i^n)_{1 \le i \le n}$ is given in Definition 1. Then $k$ satisfies (7) and strategy $V$ has error
exponent $K_{\mathrm{orth}}(c)$.

Step 2. Strategy $V$ with any sequence $k$ satisfying (7) also has the error exponent $K_{\mathrm{orth}}(c)$.
Step 3. Under Condition (7), strategy $V$ is optimal among all orthogonal strategies, that is, (17)
holds for any $A$.

StepUl Let fin = ^ X]"=i denote the empirical spectral measure of r„ defined in (H)). Szego's 
Theorem states that /i„ converges weakly to 2^Leb o (|7j|, p. 64). Applying (l30l) and then 
Lemma [5] then gives 

1=1 

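That is, $k$ defined by (31) satisfies (7). Recall that here $V_n$ is given in Definition 1 with $k$ given
by (31). The empirical spectral measure of $V_n^T \Gamma_n V_n + I_k$ is thus given by $\frac{1}{k} \sum_{i=1}^{n} \delta_{1 + \lambda_i^n} \mathbb{1}_{\Lambda_c}(\lambda_i^n)$.
Hence we have as above that

$\lim_{n \to \infty} \frac{1}{n} \log\det(V_n^T \Gamma_n V_n + I_k) = \frac{1}{2\pi} \int_{A_c} \log(1 + f(\omega)) \, d\omega$ , (32)

$\lim_{n \to \infty} \frac{1}{n} \mathrm{Tr}\left[ (V_n^T \Gamma_n V_n + I_k)^{-1} \right] = \frac{1}{2\pi} \int_{A_c} \frac{1}{1 + f(\omega)} \, d\omega$ . (33)

The spectral radius $\rho[(V_n^T \Gamma_n V_n + I_k)^{-1}]$ is bounded by $1/(1 + \Phi^{-1}(1 - c))$. Using eqs (20),
(32)-(33), Lemma 3 and (16), we obtain that, under $H_0$, $-\frac{1}{n} \mathcal{L}_{V_n} \to K_{\mathrm{orth}}(c)$ in probability. As a consequence
of Lemma 1, we obtain the assertion of Step 1.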
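
The Szegő-type limit used in Step 1 can be checked numerically in the simplest case where all eigenvalues are kept ($c = 1$). The sketch below is illustrative only: the spectral density $f(\omega) = (1 - a^2)/(1 - 2a\cos\omega + a^2)$, with autocovariance $a^{|h|}$, and the helper name `toeplitz_cov` are assumptions that do not come from the paper.

```python
import numpy as np

def toeplitz_cov(r):
    """Symmetric Toeplitz matrix whose first column is r."""
    n = len(r)
    idx = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return np.asarray(r)[idx]

a, n = 0.5, 400
# Hypothetical example: autocovariance a^|h| <-> f(w) = (1 - a^2) / (1 - 2 a cos w + a^2)
gamma_n = toeplitz_cov(a ** np.arange(n))
eigs = np.linalg.eigvalsh(gamma_n)

# Midpoint grid on (-pi, pi) for the integral of log(1 + f)
w = 2 * np.pi * (np.arange(4096) + 0.5) / 4096 - np.pi
f = (1 - a**2) / (1 - 2 * a * np.cos(w) + a**2)

lhs = np.mean(np.log1p(eigs))   # (1/n) log det(Gamma_n + I_n)
rhs = np.mean(np.log1p(f))      # (1/2pi) * integral of log(1 + f(w)) dw
```

For this toy density the two quantities already agree closely at $n = 400$, and all eigenvalues lie between $\inf f$ and $\sup f$.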

Step 2. Observe that the error exponent associated with a strategy $V$ is increasing with $k$. Now let
$k$ be a sequence satisfying (7). For any $c'$ and $c''$ such that $c' < c < c''$, define $k'$ and $k''$ by (31)
with $c$ replaced by $c'$ and $c''$ respectively. Then, as seen in Step 1, $k'$ and $k''$ also satisfy (7) with
$c$ replaced by $c'$ and $c''$ respectively. Thus, eventually, $k' < k < k''$, and, applying Step 1, the
error exponent of $V$ belongs to $[K_{\mathrm{orth}}(c'), K_{\mathrm{orth}}(c'')]$. This, with the continuity of $K_{\mathrm{orth}}$, yields
the assertion of Step 2.

Step 3. Assume $A = (A_n)$ now denotes any orthogonal strategy, that is, $A_n$ is an $n \times k$ orthogonal matrix
for all $n$, where $k$ satisfies (7). Let us prove that the bound (17) holds. By Lemma 1, for all real
$t$, we have



lim inf Pn 



n 



K^{A) < t . 



(34) 



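Let $t > \limsup_{n \to \infty} E_0[-\mathcal{L}_{A_n}/n]$. Using Markov inequality, we have for $n$ large enough,

$P_0\left[ -\frac{1}{n} \mathcal{L}_{A_n} > t \right] \le \frac{\mathrm{Var}_0\left[ \frac{1}{n} \mathcal{L}_{A_n} \right]}{\left( t - E_0[-\mathcal{L}_{A_n}/n] \right)^2}$ .

Using (20), we have

$\mathrm{Var}_0\left[ \frac{1}{n} \mathcal{L}_{A_n} \right] = \frac{1}{4n^2} \mathrm{Var}_0\left[ z^T \{ I_k - (I_k + A_n^T \Gamma_n A_n)^{-1} \} z \right] \to 0$ ,

where the convergence follows from Lemma 3 by noticing that $I_k - (I_k + A_n^T \Gamma_n A_n)^{-1}$ has
eigenvalues in $[0, 1]$ and, under $H_0$ (recall that we set $\sigma^2 = 1$), $z \sim \mathcal{N}(0, I_k)$. The last two
displays show that $P_0[-\frac{1}{n} \mathcal{L}_{A_n} > t] \to 0$ as $n \to \infty$ for all $t > \limsup_n E_0[-\mathcal{L}_{A_n}/n]$.
With (34), we get that

$K_\alpha(A) \le \limsup_n E_0[-\mathcal{L}_{A_n}/n]$ .

To conclude the proof, it thus only remains to show that $\limsup_n E_0[-\mathcal{L}_{A_n}/n] \le K_{\mathrm{orth}}(c)$.
We have, by (20),

$E_0[-2\mathcal{L}_{A_n}] = -k + \log\det(A_n^T \Gamma_n A_n + I_k) + \mathrm{Tr}\left( (A_n^T \Gamma_n A_n + I_k)^{-1} \right)$ .

Since $x \mapsto \log x + 1/x$ is nondecreasing on $[1, +\infty)$, Lemma 4 thus implies that $E_0[-\mathcal{L}_{A_n}] \le
E_0[-\mathcal{L}_{V_n}]$. We proved in Step 1 that $E_0[-\mathcal{L}_{V_n}/n] \to K_{\mathrm{orth}}(c)$. Hence the proof is achieved.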

2) The PFS case: We now prove that the PFS strategy also achieves the error exponent
$K_{\mathrm{orth}}(c)$ under the condition (7). Using the same argument as in Step 2 of the PCS case, we
can in fact take $k$ as defined by



k = max{i E {1,. . .,n} : $ o /(27rj7_ J > 1 - c} 



(35) 



where (jj")o<j<n is given in Definition [2l which we assume in the following. 

It is known ( [[T6l . Lemma 4.6) that r„ defined in & is asymptotically equivalent to F^DnFn 
where $D_n$ denotes the $n \times n$ diagonal matrix with entries $f(2\pi k/n)$, $k = 0, \dots, n - 1$. As in
[17], we denote asymptotic equivalence between matrices $A_n$ and $B_n$ by $A_n \sim B_n$. Asymptotic
equivalence is preserved by elementary matrix operations ([17], Proposition 2.1). Hence, $F_n$
being unitary,

$D_n \sim F_n \Gamma_n F_n^T$ .

Also, from the definition of $W_n$,

$W_n = F_n^T S_n$ ,

where $S_n$ is an $n \times k$ selection matrix the columns of which belong to the canonical basis. Hence,
the columns of $S_n$ being orthonormal,

$S_n^T D_n S_n \sim W_n^T \Gamma_n W_n$ . (36)

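Eq. (36) implies

$\lim_{n \to \infty} \frac{1}{n} \log\det(W_n^T \Gamma_n W_n + I_k) = \lim_{n \to \infty} \frac{1}{n} \sum_{k \in PF^n(f;c)} \log(1 + f(2\pi k/n)) = \frac{1}{2\pi} \int_{A_c} \log(1 + f(\omega)) \, d\omega$ ,

and

$\lim_{n \to \infty} \frac{1}{n} \mathrm{Tr}\left[ (W_n^T \Gamma_n W_n + I_k)^{-1} \right] = \lim_{n \to \infty} \frac{1}{n} \sum_{k \in PF^n(f;c)} \frac{1}{1 + f(2\pi k/n)} = \frac{1}{2\pi} \int_{A_c} \frac{1}{1 + f(\omega)} \, d\omega$ .

We then conclude as in Step 1 of the PCS case.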
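
To illustrate how close the PFS is to the PCS at finite $n$, the following sketch compares $\frac{1}{n} \log\det(\cdot + I_k)$ for both compressions. It is not the authors' code: the toy covariance $a^{|h|}$ is an assumption, and the complex unitary DFT matrix is used as a stand-in for $F_n$, which may differ from the (possibly real) Fourier basis of the paper.

```python
import numpy as np

a, n, c = 0.5, 512, 0.25
k = int(c * n)
lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
gamma_n = a ** lags  # Toeplitz covariance of the toy process (assumption)

# PCS: keep the k principal eigenvectors of Gamma_n.
eigvals = np.linalg.eigvalsh(gamma_n)  # ascending order
pcs = np.sum(np.log1p(eigvals[-k:])) / n

# PFS: keep the k Fourier frequencies where the spectral density f is largest.
freqs = 2 * np.pi * np.arange(n) / n
f = (1 - a**2) / (1 - 2 * a * np.cos(freqs) + a**2)
sel = np.argsort(f)[-k:]
fourier = np.fft.fft(np.eye(n)) / np.sqrt(n)  # unitary DFT matrix
w_n = fourier.conj().T[:, sel]                # n x k precoding matrix
compressed = w_n.conj().T @ gamma_n @ w_n
pfs = np.linalg.slogdet(compressed + np.eye(k))[1] / n

# Diagonal approximation suggested by the asymptotic equivalence (36).
pfs_diag = np.sum(np.log1p(f[sel])) / n
```

By Lemma 4 the value `pfs` cannot exceed `pcs`; in this toy setting the three quantities `pcs`, `pfs` and `pfs_diag` are close to each other.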



C. Proof of Theorem 3

Again we can take $k$ defined by (31) without loss of generality.

Let $\tilde{D}_n$ denote the $n \times n$ diagonal matrix with entries $\tilde{D}_n(\ell, \ell) = \tilde{f}(2\pi\ell/n)$, $\ell = 0, \dots, n - 1$,
where we defined the function

$\tilde{f}(\omega) = \left( \frac{1}{1 + \sigma^2} - \frac{1}{1 + \sigma^2 + f(\omega)} \right) \mathbb{1}(\Phi \circ f(\omega) \ge 1 - c)$ .

Then by Definition 2 and (19), we have

$T_n = u_n^T M_n u_n$ ,

where $M_n = F_n^T \tilde{D}_n F_n$ and $u_n$ is an $n$-sample of a centered stationary Gaussian process with
spectral density $g_1(\omega) = 1 + \sigma^2 + f(\omega)$ under $H_1$ and $g_0(\omega) = 1 + \sigma^2$ under $H_0$. We shall
apply [18, Proposition 2], which provides a large deviation principle (LDP) for quadratic forms
of stationary Gaussian processes. Recall that we denote by $\Gamma_n(g)$ the $n \times n$ covariance matrix
associated with the spectral density $g$. Let $\mathcal{S}_n = \mathrm{Sp}(\Gamma_n(g)^{1/2} M_n \Gamma_n(g)^{1/2})$ denote the
set of eigenvalues of $\Gamma_n(g)^{1/2} M_n \Gamma_n(g)^{1/2}$. Since $M_n$ is non-negative, to apply this result, we
successively show that for $g = g_0$ or $g = g_1$,

(i) $a_n = \max(\mathcal{S}_n)$ is bounded above by $M_{\tilde{f}} M_g$,

(ii) the following weak convergence holds: $n^{-1} \sum_{\lambda \in \mathcal{S}_n} \delta_\lambda \Rightarrow (2\pi)^{-1} \mathrm{Leb} \circ [\tilde{f} g]^{-1}$,

(iii) $a_n \to M_{\tilde{f} g}$ as $n \to \infty$.

Observe that the eigenvalues of $\tilde{D}_n$ are given by $\tilde{f}(2\pi i/n)$, with $i = 0, \dots, n - 1$, and those of
$\Gamma_n(g)$ are bounded by $M_g$. Hence we have (i). Assertion (ii) is a consequence of Lemma 5 in
[16] and Theorem 2.1 in [17]. By (i) and (ii), we have

$\limsup a_n \le M_{\tilde{f}} M_g$ and $M_{\tilde{f} g} \le \liminf a_n$ .

Thus Assertion (iii) follows by observing that $\tilde{f}$, $g_0$ and $g_1$ achieve their maxima at the same
points, thus $M_{\tilde{f} g} = M_{\tilde{f}} M_g$ for $g = g_0$ or $g_1$. Since Assertions (i)-(iii) hold, Proposition 3 and
Corollary 2 in [18] give that for $i = 0, 1$, under $H_i$, $n^{-1} T_n$ satisfies an LDP with good rate
function

$I_i(x) = \sup_{y} \left( yx + \frac{1}{4\pi} \int_{-\pi}^{\pi} \log(1 - 2y[\tilde{f} g_i](\omega)) \, d\omega \right)$ , (37)

with $g = g_0$ or $g = g_1$ under $H_0$ or $H_1$, respectively. As in [18], we assume for convenience
that $\log(x) = -\infty$ when $x \le 0$.

Assertion (ii) above also implies that $n^{-1} T_n \to \frac{1}{2\pi} \int_{-\pi}^{\pi} [\tilde{f} g](\omega) \, d\omega$ in probability, with the same convention
for $g$. Hence the sequence $(\eta_n(\alpha))$ in Theorem 3 satisfies $n^{-1} \eta_n(\alpha) \to x_0 := \frac{1}{2\pi} \int_{-\pi}^{\pi} [\tilde{f} g_0](\omega) \, d\omega$.
Thus the LDP under $H_1$ gives

$\limsup_{n \to \infty} \frac{1}{n} \log P_1(T_n < \eta_n(\alpha)) \le \inf_{c > x_0} \limsup_{n \to \infty} \frac{1}{n} \log P_1(n^{-1} T_n \le c) \le \inf_{c > x_0} \sup_{x \le c} -I_1(x)$ ,

and

$\liminf_{n \to \infty} \frac{1}{n} \log P_1(T_n < \eta_n(\alpha)) \ge \sup_{c < x_0} \liminf_{n \to \infty} \frac{1}{n} \log P_1(n^{-1} T_n < c) \ge \sup_{c < x_0} \sup_{x < c} -I_1(x)$ .

By Lemma 6, we conclude that

$\lim_{n \to \infty} -\frac{1}{n} \log P_1(T_n < \eta_n(\alpha)) = \inf_{x \le x_0} I_1(x)$ .

To conclude the proof, it only remains to show that $I_1(x_0) = K_\alpha(W)$ and that $I_1$ is nonincreasing
on $(-\infty, x_0]$. By definition of $\tilde{f}$ and $I_1$, we have $I_1(x) = \sup_y F_x(y)$, where

$F_x(y) = yx + \frac{1}{4\pi} \int_{A_c} \log(1 - 2y[g_1/g_0 - 1](\omega)) \, d\omega$ .

Using the definition of $x_0$, we further have $y x_0 = \frac{1}{2\pi} \int_{A_c} y[1 - g_0/g_1](\omega) \, d\omega$ and hence $F_{x_0}(y) =
\frac{1}{4\pi} \int_{A_c} f_\omega(2y) \, d\omega$ with

$f_\omega(y) = y[1 - g_0/g_1](\omega) + \log(1 - y[g_1/g_0 - 1](\omega))$ .

For any $\omega$, it is straightforward to show that $f_\omega(y)$ is maximized at $y = -1$, at which it takes
value $f_\omega(-1) = 2 D(\mathcal{N}(0, g_0(\omega)) \,\|\, \mathcal{N}(0, g_1(\omega)))$. Since the $y$ maximizing $f_\omega(y)$ does not depend
on $\omega$, we obtain

$I_1(x_0) = \sup_y \frac{1}{4\pi} \int_{A_c} f_\omega(2y) \, d\omega = \frac{1}{4\pi} \int_{A_c} \sup_y f_\omega(2y) \, d\omega = K_\alpha(W)$ .

We now consider $x < x_0$. By differentiating $F_x$, the $y$ maximizing $F_x(y)$ satisfies

$x = \frac{1}{2\pi} \int_{A_c} \frac{[g_1/g_0 - 1](\omega)}{1 - 2y[g_1/g_0 - 1](\omega)} \, d\omega$ .

Note that $g_1/g_0 - 1$ is non-negative and has a positive integral on $A_c$, hence the right-hand side
of the previous display has a strictly positive derivative w.r.t. $y$. It follows that $y(x)$, defined as
the $y$ maximizing $F_x(y)$, is strictly increasing with $x$. On the other hand, we know from above
that $y(x_0) = -1/2$. Thus, for all $x \le x_0$, we have $I_1(x) = \sup_{y \le -1/2} F_x(y)$. Now observe that
for all $x' < x$ and all $y < 0$ we have $F_{x'}(y) - F_x(y) = (x' - x) y > 0$. It follows that $I_1$ is
nonincreasing on $(-\infty, x_0]$, which achieves the proof.
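
The claim that $f_\omega$ is maximized at $y = -1$, with maximum equal to twice a Kullback-Leibler divergence, can be checked numerically at a single frequency. The sketch below assumes the form $f_\omega(y) = y(1 - g_0/g_1) + \log(1 - y(g_1/g_0 - 1))$ read from the proof above; the values of `g0` and `g1` are arbitrary and purely illustrative.

```python
import math

def f_omega(y, g0, g1):
    """f_omega(y) = y (1 - g0/g1) + log(1 - y (g1/g0 - 1)) at one fixed frequency."""
    return y * (1.0 - g0 / g1) + math.log(1.0 - y * (g1 / g0 - 1.0))

def kl_centered_gaussians(v0, v1):
    """Kullback-Leibler divergence D(N(0, v0) || N(0, v1)) for scalar variances."""
    return 0.5 * (v0 / v1 - 1.0 - math.log(v0 / v1))

g0, g1 = 2.0, 3.5  # hypothetical values of the two spectral densities
# Grid search inside the domain y < 1 / (g1/g0 - 1), where the logarithm is finite.
grid = [-3.0 + 0.001 * i for i in range(3600)]
y_star = max(grid, key=lambda y: f_omega(y, g0, g1))
```

Here `y_star` lands on the grid point closest to $-1$, and `f_omega(-1.0, g0, g1)` coincides with `2 * kl_centered_gaussians(g0, g1)`, since both equal $g_0/g_1 - 1 + \log(g_1/g_0)$.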

Appendix B 
Technical lemmas 

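Lemma 3 Assume that for each $n > 0$, $x_n \sim \mathcal{N}(0, \Sigma_n)$ where $\Sigma_n$ has bounded spectral radius
$\rho(\Sigma_n)$; and assume $Q_n$ is a family of quadratic forms with bounded spectral radius $\rho(Q_n)$. Then,

$\mathrm{Var}\left[ \frac{1}{n} x_n^T Q_n x_n \right] \to 0$ as $n \to \infty$ .

If, moreover, $\lim_{n \to \infty} \frac{1}{n} \mathrm{Tr}[Q_n \Sigma_n] = c$, then $\frac{1}{n} (x_n^T Q_n x_n)$ converges in the $L^2$ sense towards $c$.

Proof: One has $E[x_n^T Q_n x_n] = \mathrm{Tr}[Q_n \Sigma_n]$. Let us estimate $\mathrm{Var}[x_n^T Q_n x_n]$. In distribution,

$x_n^T Q_n x_n = u_n^T \Lambda_n u_n$ ,

with $u_n$ a standard centered Gaussian vector and $\Lambda_n$ diagonal and congruent to $\Sigma_n^{1/2} Q_n \Sigma_n^{1/2}$. Hence

$\mathrm{Var}[x_n^T Q_n x_n] = 2 \mathrm{Tr}[\Lambda_n^2] \le 2n \rho(\Lambda_n^2) \le 2n \rho(\Sigma_n)^2 \rho(Q_n)^2 \le C \cdot n$ ,

where $C$ is a constant. Thus we have, as sought,

$\mathrm{Var}\left[ \frac{1}{n} x_n^T Q_n x_n \right] \to 0$ as $n \to \infty$ .

■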
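
The variance bound in the proof of Lemma 3 can be illustrated numerically. The sketch below uses the identity $\mathrm{Var}[x^T Q x] = 2 \mathrm{Tr}[(\Sigma Q)^2]$ for $x \sim \mathcal{N}(0, \Sigma)$ and symmetric $Q$; the random choices of `sigma` and `q` are arbitrary assumptions made only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
b = rng.standard_normal((n, n))
sigma = b @ b.T / n                        # a covariance matrix (symmetric, non-negative)
q = np.diag(rng.uniform(-1.0, 1.0, n))     # symmetric matrix of the quadratic form

def spectral_radius(m):
    """Largest absolute eigenvalue of a symmetric matrix."""
    return np.max(np.abs(np.linalg.eigvalsh(m)))

# Exact variance of x^T Q x for x ~ N(0, sigma), and the bound used in the proof.
var_exact = 2.0 * np.trace(sigma @ q @ sigma @ q)
bound = 2.0 * n * spectral_radius(sigma) ** 2 * spectral_radius(q) ** 2
```

Dividing both quantities by $n^2$ gives the variance of $\frac{1}{n} x^T Q x$ and its $O(1/n)$ bound when the spectral radii stay bounded.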

Lemma 4 ([19], p. 189) Let $Q$ be a symmetric $n \times n$ matrix, and $V$ be an $r$-dimensional subspace
of $\mathbb{R}^n$. Denote by $Q_V$ the restriction of $Q$ to $V$, $\lambda_i$, $i \in \{1, \dots, n\}$, the eigenvalues of $Q$ in
increasing order and $\mu_j$, $j \in \{1, \dots, r\}$, the eigenvalues of $Q_V$ in increasing order. Then, for
all $i = 1, \dots, r$, we have $\lambda_i \le \mu_i \le \lambda_{n+i-r}$.

Lemma 5 Under A1-A2, we have for any $c \in [0, 1]$, $\mathrm{Leb}(A_c) = 2\pi c$, where $A_c$ is defined
by (15).

Proof: We have $A_c = (\Phi \circ f)^{-1}([1 - c, \infty)) \cap (-\pi, \pi)$, where $(\Phi \circ f)^{-1}$ denotes the inverse
image under $\Phi \circ f$. Observe that $(\Phi \circ f)^{-1} = f^{-1} \circ \Phi^{-1}$. Moreover, as we have seen in the preamble
of Appendix A-B, $\Phi$ is continuously and strictly increasing from $[m_f, M_f]$ to $[0, 1]$ and constant
on $[M_f, \infty)$, hence $\Phi^{-1}([1 - c, \infty)) = [\Phi^{-1}(1 - c), \infty)$, where here $\Phi^{-1}$ denotes the inverse
function from $[0, 1]$ to $[0, M_f]$. Hence $A_c = f^{-1}([\Phi^{-1}(1 - c), \infty))$. Now since $\Phi$ is the distribution
function of the probability measure $(2\pi)^{-1} \mathrm{Leb} \circ f^{-1}$ and using again that it is continuously and
strictly increasing from $[0, M_f]$ to $[0, 1]$, we get that $(2\pi)^{-1} \mathrm{Leb}(A_c) = 1 - (1 - c) = c$, which
concludes the proof. ■

Lemma 6 Let $I(x)$ be defined for $x \in \mathbb{R}$ by (37) with values in $\mathbb{R} \cup \{\infty\}$ for some non-negative
bounded function $h = [\tilde{f} g]$. Then $I(x)$ is finite and continuous for $x > 0$.

Proof: Let $J_x(y) = yx + \frac{1}{4\pi} \int_{-\pi}^{\pi} \log(1 - 2y[\tilde{f} g](\omega)) \, d\omega$ so that $I(x) = \sup_y J_x(y)$. Let $M_h$
denote the essential sup of $h$. Then $J_x(y) = -\infty$ for all $y > 1/(2M_h)$. Let $\epsilon > 0$. Note that
$J_x(0) = 0$ and for all $x \ge \epsilon$ and $y < 0$, $J_x(y) \le yx + \log(1 - 2yM_h)/2 \to -\infty$ as $y \to -\infty$.
Thus there exists $y_\epsilon$ only depending on $\epsilon$ such that $J_x(y) < 0$ for all $x > \epsilon$ and $y < y_\epsilon$. From
these facts, it follows that for all $x > \epsilon$, $I(x) = \sup_{y \in [y_\epsilon, 1/(2M_h)]} J_x(y)$. Finally we observe that
for all $x, x' > \epsilon$, $\sup_{y \in [y_\epsilon, 1/(2M_h)]} |J_x(y) - J_{x'}(y)| \le (-y_\epsilon \vee 1/(2M_h)) \, |x - x'|$, which now yields
the result. ■